Modeling and Mapping Location-Dependent Human Appearance
Human appearance is highly variable and depends on individual preferences, such as fashion, facial expression, and makeup. These preferences depend on many factors, including a person's sense of style, what they are doing, and the weather. These factors, in turn, depend on geographic location and time. In our work, we build computational models to learn the relationship between human appearance, geographic location, and time. The primary contributions are a framework for collecting and processing geotagged imagery of people, a large dataset collected by our framework, and several generative and discriminative models that use our dataset to learn the relationship between human appearance, location, and time. Additionally, we build interactive maps that allow for inspection and demonstration of what our models have learned.
An Automatic Framework for Embryonic Localization Using Edges in a Scale Space
Localization of Drosophila embryos in images is a fundamental step in an automatic computational system for exploring gene-gene interactions in Drosophila. Contour extraction from embryonic images is challenging due to the many variations in such images. In this thesis, we develop a localization framework based on the analysis of connected components of edge pixels in a scale space. We propose criteria for selecting optimal scales for embryonic localization. Furthermore, we propose a scale mapping strategy that compresses the range of the scale space to improve the efficiency of the localization framework. The effectiveness of the proposed framework and of the scale mapping strategy is validated in our experiments.
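The thesis does not spell out its selection criteria, but the pipeline it describes — smooth at several scales, extract edge pixels, analyze connected components, pick a scale — can be sketched in a toy form. The following Python sketch is a hypothetical illustration, not the thesis' method: it uses a simple gradient-magnitude edge detector, keeps the largest connected edge component at each scale, and scores scales by the edge density inside that component's bounding box (an assumed stand-in for the real criteria).

```python
import numpy as np
from scipy import ndimage

def localize_scale_space(image, sigmas=(1.0, 2.0, 4.0), grad_thresh=0.1):
    """Locate an object via connected edge components across a Gaussian
    scale space. Returns the bounding box (a pair of slices) of the
    largest edge component at the best-scoring scale, or None."""
    best_score, best_box = -1.0, None
    for sigma in sigmas:
        smoothed = ndimage.gaussian_filter(image.astype(float), sigma)
        gy, gx = np.gradient(smoothed)
        edges = np.hypot(gx, gy) > grad_thresh  # crude edge map
        labels, n = ndimage.label(edges)
        if n == 0:
            continue  # this scale smoothed away all edges
        sizes = ndimage.sum(edges, labels, index=range(1, n + 1))
        largest = int(np.argmax(sizes)) + 1
        box = ndimage.find_objects((labels == largest).astype(int))[0]
        area = (box[0].stop - box[0].start) * (box[1].stop - box[1].start)
        score = float(sizes[largest - 1]) / area  # edge density in the box
        if score > best_score:
            best_score, best_box = score, box
    return best_box

# Synthetic "embryo": a bright ellipse on a dark background.
yy, xx = np.mgrid[0:100, 0:140]
img = (((yy - 50) / 20.0) ** 2 + ((xx - 70) / 35.0) ** 2 < 1).astype(float)
box = localize_scale_space(img)
```

The scale mapping strategy the abstract mentions would correspond to choosing a smaller, remapped set of `sigmas` to search, trading scale-space coverage for speed.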
iBARLE: imBalance-Aware Room Layout Estimation
Room layout estimation predicts a room's layout from a single panorama. Training such models requires large-scale datasets with diverse room shapes. However, real-world datasets exhibit significant imbalances along the dimensions of layout complexity, camera location, and scene appearance, and these imbalances considerably hurt model training.
In this work, we propose the imBalance-Aware Room Layout Estimation (iBARLE)
framework to address these issues. iBARLE consists of (1) an Appearance Variation Generation (AVG) module, which promotes domain generalization across visual appearance, (2) a Complex Structure Mix-up (CSMix) module, which enhances generalizability w.r.t. room structure, and (3) a gradient-based layout
objective function, which allows more effective accounting for occlusions in
complex layouts. All modules are trained jointly and reinforce one another to achieve the best performance. Experiments and ablation studies on the ZInD dataset [Cruz et al., 2021] show that iBARLE achieves state-of-the-art performance compared with other layout estimation baselines.
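The abstract does not detail how CSMix composes complex structures, but the underlying mix-up idea (Zhang et al., 2018) — blend two training samples and their targets with a Beta-sampled weight — can be shown in a minimal sketch. Everything here is a generic illustration, assuming per-column layout boundaries as the target representation; iBARLE's actual module is more involved.

```python
import numpy as np

def structure_mixup(img_a, lay_a, img_b, lay_b, alpha=0.4, rng=None):
    """Generic mix-up: sample a blending weight lam ~ Beta(alpha, alpha)
    and convexly blend both the panoramas and their layout targets."""
    rng = rng if rng is not None else np.random.default_rng(0)
    lam = float(rng.beta(alpha, alpha))
    img = lam * img_a + (1.0 - lam) * img_b
    lay = lam * lay_a + (1.0 - lam) * lay_b
    return img, lay, lam

# Two toy "panoramas" (H x W) with per-column layout boundaries (W,).
rng = np.random.default_rng(42)
img_a, img_b = np.zeros((4, 8)), np.ones((4, 8))
lay_a, lay_b = np.full(8, 1.0), np.full(8, 3.0)
img, lay, lam = structure_mixup(img_a, lay_a, img_b, lay_b, rng=rng)
```

A small `alpha` concentrates `lam` near 0 or 1, so most mixed samples stay close to one parent; raising `alpha` produces more aggressive blends.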
LASER: LAtent SpacE Rendering for 2D Visual Localization
We present LASER, an image-based Monte Carlo Localization (MCL) framework for
2D floor maps. LASER introduces the concept of latent space rendering, where 2D
pose hypotheses on the floor map are directly rendered into a
geometrically-structured latent space by aggregating viewing ray features.
Through a tightly coupled rendering codebook scheme, the viewing ray features
are dynamically determined at render time based on their geometry (i.e., length and incident angle), endowing our representation with view-dependent fine-grained variability. Our codebook scheme effectively disentangles feature encoding from rendering, allowing latent space rendering to run at speeds above 10 kHz. Moreover, through metric learning, our geometrically-structured
latent space is common to both pose hypotheses and query images with arbitrary fields of view. As a result, LASER achieves state-of-the-art performance on large-scale indoor localization datasets (i.e., ZInD and Structured3D) for both panorama and perspective image queries, while significantly outperforming existing learning-based methods in speed.
Comment: CVPR 2022 Oral
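The core idea — render a 2D pose hypothesis directly into latent space by aggregating per-ray features looked up from a codebook — can be illustrated with a toy Python sketch. Everything below is assumed for illustration: a random (not learned) codebook, length-only quantization (LASER's codebook also conditions on incident angle), and a rectangular room standing in for a real floor map.

```python
import numpy as np

def ray_hit_distance(x, y, ang, room=(10.0, 10.0)):
    # Distance from (x, y) along direction `ang` to the boundary of an
    # axis-aligned rectangular "floor map" (stand-in for real wall geometry).
    dx, dy = np.cos(ang), np.sin(ang)
    ts = []
    if dx > 1e-9:
        ts.append((room[0] - x) / dx)
    elif dx < -1e-9:
        ts.append(-x / dx)
    if dy > 1e-9:
        ts.append((room[1] - y) / dy)
    elif dy < -1e-9:
        ts.append(-y / dy)
    return min(t for t in ts if t > 0)

def render_pose_latent(pose, codebook, n_rays=16):
    """Cast viewing rays from a 2D pose, quantize each ray's length into a
    codebook bin, look up that bin's feature, and average the features into
    one pose descriptor -- rendering directly into latent space, with no
    pixels ever produced."""
    x, y, theta = pose
    feats = []
    for k in range(n_rays):
        ang = theta + 2.0 * np.pi * k / n_rays
        d = ray_hit_distance(x, y, ang)
        idx = min(int(d * 4), len(codebook) - 1)  # 0.25 m length bins
        feats.append(codebook[idx])
    return np.mean(feats, axis=0)

rng = np.random.default_rng(0)
codebook = rng.normal(size=(64, 8))  # 64 length bins, 8-dim features
desc_center = render_pose_latent((5.0, 5.0, 0.0), codebook)
desc_corner = render_pose_latent((1.0, 1.0, 0.0), codebook)
```

In an MCL loop, descriptors like these would be compared against a query-image embedding (made comparable via metric learning) to weight pose hypotheses; the lookup-and-average render is cheap enough to score many hypotheses per query.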